oLLM is a Python library for running large-context Transformers on NVIDIA GPUs by offloading weights and KV-cache to SSDs. It supports models like Llama-3, GPT-OSS-20B, and Qwen3-Next-80B, enabling up to 100K tokens of context on 8-10 GB GPUs without quantization.
The Model Context Protocol (MCP) is a new open protocol that allows AI models to interact with external systems in a standardized, extensible way. In this tutorial, you’ll install MCP, explore its client-server architecture, and work with its core concepts: prompts, resources, and tools.
A curated collection of Awesome LLM apps built with RAG, AI Agents, Multi-agent Teams, MCP, Voice Agents, and more. This repository features LLM apps that use models from OpenAI, Anthropic, Google, xAI and open-source models like Qwen or Llama.
An open-source web crawler packaged as a minimal, real-time web search CLI. Enter a query and get search results as JSON (title, url, published_date), sorted by recency.
A frozen-in-time version of our Paper Finder agent, kept for reproducing evaluation results. This repo contains the code for the standalone Paper Finder agent. PaperFinder is our paper-seeking agent, intended to help locate sets of papers according to content-based and metadata criteria.
This GitHub repository directory contains resources for evaluating Large Language Models (LLMs), including a Jupyter Notebook demonstrating how to use LLM Arena as a judge and a Python script for the same purpose. It also includes a README file with instructions on how to view the notebook if it doesn't render correctly on GitHub.
A technical article explaining how a small change in async Python code—using a semaphore to limit concurrency—reduced LLM request volume and costs by 90% without sacrificing performance.
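As a rough illustration of the technique named above, the sketch below caps the number of in-flight requests with `asyncio.Semaphore`. It is not the article's code: `call_llm`, `run_all`, and `max_concurrent` are hypothetical names, and the request is simulated with a short sleep rather than a real API call.

```python
import asyncio


async def call_llm(prompt: str) -> str:
    # Hypothetical stand-in for a real LLM request; it only sleeps briefly.
    await asyncio.sleep(0.01)
    return f"response to {prompt}"


async def run_all(prompts, max_concurrent=5):
    # The semaphore allows at most `max_concurrent` coroutines inside the
    # `async with` block at any one time; the rest wait their turn.
    semaphore = asyncio.Semaphore(max_concurrent)
    in_flight = 0
    peak = 0  # highest number of simultaneous requests observed

    async def limited(prompt):
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await call_llm(prompt)
            finally:
                in_flight -= 1

    # gather() schedules every task up front and returns results in input order.
    results = await asyncio.gather(*(limited(p) for p in prompts))
    return results, peak


if __name__ == "__main__":
    results, peak = asyncio.run(run_all([f"p{i}" for i in range(20)], max_concurrent=3))
    print(len(results), "responses, peak concurrency", peak)
```

The `peak` counter is there only to make the cap observable; in real use the `async with semaphore:` line is the whole change.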
Google has introduced LangExtract, an open-source Python library designed to help developers extract structured information from unstructured text using large language models such as the Gemini models. The library simplifies the process of converting free-form text into structured data, offering features like controlled generation, text chunking, parallel processing, and integration with various LLMs.
AI-powered multi-agent system that automatically analyzes codebases and generates comprehensive documentation. Features GitLab integration, concurrent processing, and multiple LLM support for better code understanding and developer onboarding.
This GitHub repository contains a collection of example files demonstrating various use cases and configurations for the llamafiles tools. The examples cover:
* **System Administration:** Scripts and configurations for Ubuntu, Raspberry Pi 5, and macOS.
* **LLM Interaction:** Examples of prompts and interactions with LLMs like Mixtral and Dolphin.
* **Text Processing:** Scripts for summarizing text, extracting information, and formatting output.
* **Development Tools:** Examples related to Git, Emacs, and other development tools.
* **Hardware Monitoring:** Scripts for monitoring GPU and NVMe drive status.